Rebrickable - Reproducible Report

Introduction

Authors

Tymon Dydowicz 151936
Wojtek Cieśla 151957

Description

To create this report we used the freely available Rebrickable dataset, which describes Lego sets, parts, and the relationships between them. Our main goal was to create enjoyable and aesthetically appealing visualizations and to convey the enormous scale of the Lego industry. Most of the charts are hard to interpret at first glance and require some exploring, which plotly’s interactivity makes easy.

Analysis

Top 20 Parts by quantity

A simple bar chart showing the top 20 parts by quantity. The most popular part is a simple 1x1 brick, which is used in almost every set.
This is no surprise, because the piece is very versatile and can be used in many different ways.
3023_Brick

Some parts are counted in the hundreds of thousands, which shows well how big the Lego industry is.
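
The ranking behind a chart like this can be sketched in a few lines. This is only an illustration, not the code used for the report: it assumes rows shaped like the Rebrickable inventory parts table, with a `part_num` and a `quantity` field, and the helper name `top_parts` and the sample rows are made up.

```python
from collections import Counter

def top_parts(rows, n=20):
    """Sum the quantity of every part and return the n most used parts.

    `rows` is assumed to be an iterable of dicts with the keys
    "part_num" and "quantity" (hypothetical, modelled on the dataset).
    """
    totals = Counter()
    for row in rows:
        totals[row["part_num"]] += int(row["quantity"])
    # most_common sorts by total quantity, largest first
    return totals.most_common(n)

# Tiny made-up sample; real data would come from csv.DictReader.
sample_rows = [
    {"part_num": "3023", "quantity": "4"},
    {"part_num": "3005", "quantity": "2"},
    {"part_num": "3023", "quantity": "6"},
    {"part_num": "3004", "quantity": "1"},
]
top2 = top_parts(sample_rows, n=2)
```

The resulting list of `(part_num, total)` pairs can be passed directly to a bar chart.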

Number of Parts per year for a given theme

This is a bubble chart showing the number of parts per year for a given theme. The size of a bubble represents the number of parts in a set, and each color of the rainbow represents a different theme.
I particularly like this visualization because of how colorful it is. Even though it’s a bit cluttered at the beginning, it creates a nice visual effect, and the clutter is not an issue thanks to the ability to zoom in.

Colorful Lego

This one is nothing special: a grid of pie charts, each showing the color distribution of a given theme.
It lets you see which themes are colorful and which are monochromatic or dominated by shades of gray.
It is not particularly visually appealing, but it’s a good way to show a lot of data in a small space and to provide some potentially useful insight into the colors of themes.

Tree map of theme hierarchy and percentage of spare parts in sets

Both of these treemaps present the hierarchy of themes and how many spare parts are in the sets belonging to a given theme.
The size of a rectangle represents the number of parts in a given theme, and the hue of its color represents the percentage of spare parts.
The first one, although a bit cluttered, makes it easy to draw some conclusions right away. For example, educational sets have almost no spare parts, probably because they are not needed in educational projects, which are usually bought by schools. Seasonal sets have far more spare parts than other themes, probably to lower the risk of customer dissatisfaction if parts were missing.
The second one is cleaner and interactive. It is harder to draw conclusions from, but it serves as a good exploration tool.
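
The value encoded by the hue can be sketched as follows. This is a minimal illustration under assumptions, not the report’s actual code: each row is taken to be a `(theme, quantity, is_spare)` tuple, the function name `spare_share_by_theme` is hypothetical, and the percentage is assumed to be spare parts divided by all parts, spares included.

```python
from collections import defaultdict

def spare_share_by_theme(rows):
    """Return the percentage of spare parts per theme.

    `rows` is assumed to be an iterable of (theme, quantity, is_spare)
    tuples; the denominator counts all parts, including spares.
    """
    total = defaultdict(int)
    spare = defaultdict(int)
    for theme, quantity, is_spare in rows:
        total[theme] += quantity
        if is_spare:
            spare[theme] += quantity
    # Skip themes without parts to avoid dividing by zero
    return {
        theme: 100.0 * spare[theme] / total[theme]
        for theme in total
        if total[theme] > 0
    }

# Made-up sample data for illustration only.
sample = [
    ("Seasonal", 90, False),
    ("Seasonal", 10, True),
    ("Educational", 50, False),
]
shares = spare_share_by_theme(sample)
```

In a treemap, `total` would drive the rectangle size and the returned share would drive the color.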

Version 1

Version 2

Distribution of theme popularity

This graph shows the distribution of sets released in a particular year for a given theme.
Years are on the x-axis and themes are on the y-axis; the height of the distribution curve represents how many sets were released in a given year for that theme.
It’s an example of a visualization that is both clean and easy to interpret. Easy takeaways are, for example, that ‘gear’ has been rising in popularity since 2000 and is the most popular theme, or that ‘bionicle’, after its initial boom in 2002, died out early on.
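
The curve heights come down to counting sets per theme and year. The sketch below shows that counting step only, assuming each set is reduced to a `(theme, year)` pair; the function name and the sample values are invented for illustration.

```python
from collections import Counter

def sets_per_theme_year(sets):
    """Count released sets for every (theme, year) pair.

    `sets` is assumed to be an iterable of (theme, year) tuples.
    """
    return Counter((theme, year) for theme, year in sets)

# Made-up sample data for illustration only.
sample_sets = [
    ("Bionicle", 2002),
    ("Bionicle", 2002),
    ("Gear", 2002),
    ("Gear", 2010),
]
counts = sets_per_theme_year(sample_sets)
```

Looking up a pair that never occurs, such as `counts[("Bionicle", 2010)]`, returns 0, which is convenient when a theme has gaps between release years.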

Minifigure relations

The magnum opus of this project.
This network graph shows relations between minifigures.
Every node is a minifigure; the size of the node represents how many sets that minifigure appeared in, and the depth of the green color represents how many themes it appeared in.
Two minifigures are connected if they appeared in at least one common theme.
This graph is pleasing to look at and can provide some interesting insights, and thanks to its interactivity it’s easy to explore.
Right away we can figure out which minifigure is the most generic by looking at the number of links coming out of its node.
With further analysis we can find fully connected groups of nodes, which with high probability means that those minifigures are from the same theme and that there is some protagonist among them.
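
The linking rule and the “most generic minifigure” reading can be sketched like this. It is a simplified illustration, not the code behind the graph: the input is assumed to be a mapping from minifigure name to the set of themes it appears in, and all names (`build_edges`, `degrees`, `fig-A`, and so on) are hypothetical.

```python
from itertools import combinations

def build_edges(themes_by_minifig):
    """Connect two minifigures if they share at least one theme."""
    edges = set()
    for a, b in combinations(sorted(themes_by_minifig), 2):
        if themes_by_minifig[a] & themes_by_minifig[b]:
            edges.add((a, b))
    return edges

def degrees(themes_by_minifig, edges):
    """Count the number of links attached to every minifigure."""
    degree = {name: 0 for name in themes_by_minifig}
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree

# Made-up sample data for illustration only.
sample = {
    "fig-A": {"City", "Space"},
    "fig-B": {"City"},
    "fig-C": {"Space"},
    "fig-D": {"Castle"},
}
edges = build_edges(sample)
degree = degrees(sample, edges)
# The node with the most links is the "most generic" minifigure
most_generic = max(degree, key=degree.get)
```

Comparing every pair is quadratic in the number of minifigures, so for the full dataset it may be more practical to group minifigures by theme first and link only within each group.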

Conclusion

This project was a great opportunity to learn about data visualization.
I’ve learned a lot about different types of graphs and how to use them to present data. I hope that I managed to make the charts eye-catching, nice to look at, and interesting.